Minimax Estimation of Kernel Mean Embeddings
In this paper, we study the minimax estimation of the Bochner integral $\mu_k(P) := \int_{\mathcal{X}} k(\cdot, x)\, dP(x)$, also called the kernel mean embedding, based on random samples drawn i.i.d. from $P$, where $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a positive definite kernel. Various estimators (including the empirical estimator), $\hat{\theta}_n$, of $\mu_k(P)$ are studied in the literature, and all of them satisfy $\|\hat{\theta}_n - \mu_k(P)\|_{\mathcal{H}_k} = O_P(n^{-1/2})$, with $\mathcal{H}_k$ being the reproducing kernel Hilbert space induced by $k$. The main contribution of the paper is in showing that the above-mentioned rate of $n^{-1/2}$ is minimax in the $\|\cdot\|_{\mathcal{H}_k}$ and $\|\cdot\|_{L^2(\mathbb{R}^d)}$ norms over the class of discrete measures and the class of measures with an infinitely differentiable density, with $k$ being a continuous translation-invariant kernel on $\mathbb{R}^d$. The interesting aspect of this result is that the minimax rate is independent of the smoothness of the kernel and of the density of $P$ (if it exists). This result has practical consequences in statistical applications, as the mean embedding has been widely employed in non-parametric hypothesis testing, density estimation, causal inference and feature selection, through its relation to energy distance (and distance covariance).
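As a concrete illustration of the objects in this abstract, here is a minimal sketch of the empirical estimator $\hat{\mu}_k = \frac{1}{n}\sum_{i=1}^n k(\cdot, x_i)$ with a Gaussian RBF kernel, together with the closed-form squared RKHS distance (squared MMD) between two empirical embeddings. All function names and parameter choices are illustrative, not from the paper.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    diff = x - y
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))

def empirical_mean_embedding(sample, sigma=1.0):
    """Empirical estimator mu_hat = (1/n) sum_i k(., x_i), returned as a
    function that can be evaluated at any point of the input space."""
    def mu_hat(z):
        return np.mean([rbf_kernel(x, z, sigma) for x in sample])
    return mu_hat

def squared_mmd(X, Y, sigma=1.0):
    """Squared RKHS distance between the empirical embeddings of X and Y."""
    kxx = np.mean([rbf_kernel(a, b, sigma) for a in X for b in X])
    kyy = np.mean([rbf_kernel(a, b, sigma) for a in Y for b in Y])
    kxy = np.mean([rbf_kernel(a, b, sigma) for a in X for b in Y])
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))    # i.i.d. sample from P
mu = empirical_mean_embedding(X)
print(mu(np.zeros(2)))           # evaluate mu_hat at the origin
```

The minimax result above says that, in the RKHS norm, no estimator improves on this simple average beyond the $n^{-1/2}$ rate over the stated classes of measures.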
PAC-Bayes-Empirical-Bernstein Inequality
We present the PAC-Bayes-Empirical-Bernstein inequality. The inequality is based on a combination of the PAC-Bayesian bounding technique with the Empirical Bernstein bound. It makes it possible to take advantage of small empirical variance and is especially useful in regression. We show that when the empirical variance is significantly smaller than the empirical loss, the PAC-Bayes-Empirical-Bernstein inequality is significantly tighter than the PAC-Bayes-kl inequality of Seeger (2002), and otherwise it is comparable. The PAC-Bayes-Empirical-Bernstein inequality is an interesting example of the application of the PAC-Bayesian bounding technique to self-bounding functions. We provide an empirical comparison of the PAC-Bayes-Empirical-Bernstein inequality with the PAC-Bayes-kl inequality on a synthetic example and several UCI datasets.
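For background, here is a minimal sketch of the empirical Bernstein bound of Maurer and Pontil (2009), one ingredient of the combination described above. The paper's actual PAC-Bayesian bound additionally involves a KL divergence between posterior and prior distributions over hypotheses, which this sketch omits; the numeric example is synthetic.

```python
import numpy as np

def empirical_bernstein_bound(losses, delta):
    """Empirical Bernstein upper confidence bound (Maurer & Pontil, 2009):
    with probability >= 1 - delta, the true mean of [0,1]-valued losses is
    at most mean + sqrt(2*var*ln(2/delta)/n) + 7*ln(2/delta)/(3*(n-1))."""
    n = len(losses)
    mean = np.mean(losses)
    var = np.var(losses, ddof=1)           # unbiased sample variance
    log_term = np.log(2.0 / delta)
    return (mean
            + np.sqrt(2.0 * var * log_term / n)
            + 7.0 * log_term / (3.0 * (n - 1)))

# When the empirical variance is much smaller than the empirical loss, the
# sqrt-variance term shrinks and the bound tightens; this is the effect the
# PAC-Bayes-Empirical-Bernstein inequality exploits at the PAC-Bayes level.
rng = np.random.default_rng(0)
low_var = np.clip(rng.normal(0.3, 0.01, size=1000), 0.0, 1.0)
high_var = np.clip(rng.normal(0.3, 0.40, size=1000), 0.0, 1.0)
print(empirical_bernstein_bound(low_var, delta=0.05))
print(empirical_bernstein_bound(high_var, delta=0.05))
```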
Towards a Learning Theory of Cause-Effect Inference
We pose causal inference as the problem of learning to classify probability distributions. In particular, we assume access to a collection $\{(S_i, l_i)\}_{i=1}^n$, where each $S_i$ is a sample drawn from the probability distribution of $X_i \times Y_i$, and $l_i$ is a binary label indicating whether "$X_i \to Y_i$" or "$X_i \leftarrow Y_i$". Given these data, we build a causal inference rule in two steps. First, we featurize each $S_i$ using the kernel mean embedding associated with some characteristic kernel. Second, we train a binary classifier on such embeddings to distinguish between causal directions. We present generalization bounds showing the statistical consistency and learning rates of the proposed approach, and provide a simple implementation that achieves state-of-the-art cause-effect inference. Furthermore, we extend our ideas to infer causal relationships between more than two variables.
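A minimal sketch of the two-step rule on synthetic data, using random Fourier features (Rahimi & Recht, 2007) to approximate the kernel mean embedding of each sample and scikit-learn's logistic regression as the binary classifier. The data-generating process, feature dimensions, and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def mean_embedding_features(sample, W, b):
    """Approximate kernel mean embedding of a sample of pairs (x, y)
    via random Fourier features: (1/m) sum_i sqrt(2/D) cos(W^T z_i + b)."""
    proj = sample @ W + b                          # shape (m, D)
    return np.sqrt(2.0 / W.shape[1]) * np.cos(proj).mean(axis=0)

rng = np.random.default_rng(0)
D = 100                                            # number of random features
W = rng.normal(size=(2, D))                        # pairs (x, y) are 2-d
b = rng.uniform(0.0, 2 * np.pi, size=D)

# Toy training set: samples S_i with labels l_i = 1 for "X -> Y", 0 otherwise.
def make_pair(cause_to_effect):
    x = rng.normal(size=(200, 1))
    y = np.tanh(x) + 0.3 * rng.normal(size=(200, 1))
    return np.hstack([x, y]) if cause_to_effect else np.hstack([y, x])

samples = [make_pair(i % 2 == 0) for i in range(200)]
labels = [1 if i % 2 == 0 else 0 for i in range(200)]
features = np.stack([mean_embedding_features(S, W, b) for S in samples])

# Step two: a binary classifier on the embeddings distinguishes directions.
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(clf.score(features, labels))                 # accuracy on the toy data
```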
Differentially Private Database Release via Kernel Mean Embeddings
We lay theoretical foundations for new database release mechanisms that allow third parties to construct consistent estimators of population statistics, while ensuring that the privacy of each individual contributing to the database is protected. The proposed framework rests on two main ideas. First, releasing
(an estimate of) the kernel mean embedding of the data generating random
variable instead of the database itself still allows third parties to construct
consistent estimators of a wide class of population statistics. Second, the
algorithm can satisfy the definition of differential privacy by basing the
released kernel mean embedding on entirely synthetic data points, while
controlling accuracy through the metric available in a Reproducing Kernel
Hilbert Space. We describe two instantiations of the proposed framework,
suitable under different scenarios, and prove theoretical results guaranteeing
differential privacy of the resulting algorithms and the consistency of
estimators constructed from their outputs.
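The paper's mechanism bases the released embedding on entirely synthetic data points; as a simpler illustration of the first idea only, here is a hedged sketch that privatizes a random-feature approximation of the empirical mean embedding with the standard Gaussian mechanism. The feature map, the sensitivity bound, and the noise calibration are assumptions of this sketch, not the paper's algorithm.

```python
import numpy as np

def rff(x, W, b):
    """Random Fourier feature map approximating a Gaussian kernel;
    each feature vector has L2 norm at most sqrt(2)."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(x @ W + b)

def private_mean_embedding(data, W, b, epsilon, delta, rng):
    """(epsilon, delta)-DP approximation of the empirical mean embedding
    via the Gaussian mechanism: replacing one record changes the mean of
    the feature vectors by at most 2*sqrt(2)/n in L2 norm."""
    n = data.shape[0]
    mu_hat = rff(data, W, b).mean(axis=0)
    sensitivity = 2.0 * np.sqrt(2.0) / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return mu_hat + rng.normal(scale=sigma, size=mu_hat.shape)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 200))
b = rng.uniform(0.0, 2 * np.pi, size=200)
data = rng.normal(size=(10_000, 2))       # stands in for the sensitive database
released = private_mean_embedding(data, W, b, epsilon=1.0, delta=1e-5, rng=rng)
print(released[:5])
```

Third parties can then estimate population statistics from the released vector, with accuracy controlled by the RKHS metric as described above.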
Spatial Evolutionary Generative Adversarial Networks
Generative adversarial networks (GANs) suffer from training pathologies such as
instability and mode collapse. These pathologies mainly arise from a lack of
diversity in their adversarial interactions. Evolutionary generative
adversarial networks apply the principles of evolutionary computation to
mitigate these problems. We hybridize two of these approaches that promote
training diversity. One, E-GAN, injects mutation diversity by training the
(replicated) generator with three independent objective functions at each batch
and then selecting the best-performing resulting generator for the next batch. The
other, Lipizzaner, injects population diversity by training a two-dimensional
grid of GANs with a distributed evolutionary algorithm that includes neighbor
exchanges of additional training adversaries, performance-based selection, and
population-based hyper-parameter tuning. We propose to combine mutation and
population approaches to diversity improvement. We contribute a superior
evolutionary GAN training method, Mustangs, that eliminates the single loss
function used across Lipizzaner's grid. Instead, in each training round, a loss
function is selected with equal probability from among the three used by E-GAN.
Experimental analyses on standard benchmarks, MNIST and CelebA, demonstrate
that Mustangs provides a statistically faster training method resulting in more
accurate networks.
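A minimal sketch of the loss-selection mutation described above: each training round, one of the three E-GAN generator objectives (minimax, non-saturating/heuristic, and least-squares) is drawn uniformly at random. The spatial grid, neighbor exchange, and selection steps of Lipizzaner and Mustangs are omitted, and the discriminator outputs below are synthetic stand-ins.

```python
import numpy as np

# The three E-GAN generator objectives, written over discriminator
# outputs d_fake in (0, 1) on generated samples.
def minimax_loss(d_fake):
    return np.mean(np.log(1.0 - d_fake + 1e-8))    # original GAN objective

def heuristic_loss(d_fake):
    return -np.mean(np.log(d_fake + 1e-8))         # non-saturating variant

def least_squares_loss(d_fake):
    return np.mean((d_fake - 1.0) ** 2)            # LSGAN-style objective

LOSSES = [minimax_loss, heuristic_loss, least_squares_loss]

def pick_generator_loss(rng):
    """Mustangs-style mutation: each round, each cell of the grid draws
    one of the three objectives uniformly at random."""
    return LOSSES[rng.integers(len(LOSSES))]

rng = np.random.default_rng(0)
d_fake = rng.uniform(0.05, 0.95, size=64)  # stand-in discriminator outputs
loss_fn = pick_generator_loss(rng)
print(loss_fn.__name__, loss_fn(d_fake))
```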